GERBIL: Genotype resolution and block identification using likelihood.
Abstract
The abundance of genotype data generated by individual and international efforts carries the promise of revolutionizing disease studies and the association of phenotypes with individual polymorphisms. A key challenge is providing an accurate resolution (phasing) of the genotypes into haplotypes. We present here results on a method for genotype phasing in the presence of recombination. Our analysis is based on a stochastic model for recombination-poor regions ("blocks"), in which haplotypes are generated from a small number of core haplotypes, allowing for mutations, rare recombinations, and errors. We formulate genotype resolution and block partitioning as a maximum-likelihood problem and solve it by an expectation-maximization algorithm. The algorithm was implemented in a software package called GERBIL (genotype resolution and block identification using likelihood), which is efficient and simple to use. We tested GERBIL on four large-scale sets of genotypes. It outperformed two state-of-the-art phasing algorithms. The PHASE algorithm was slightly more accurate than GERBIL when allowed to run with default parameters, but required two orders of magnitude more time. When using comparable running times, GERBIL was consistently more accurate. For data sets with hundreds of genotypes, the time required by PHASE becomes prohibitive. We conclude that GERBIL has a clear advantage for studies that include many hundreds of genotypes and, in particular, for large-scale disease studies.
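To make the idea of likelihood-based phasing by expectation-maximization concrete, the following is a minimal Python sketch of the classic haplotype-frequency EM on a handful of SNPs. It is not the GERBIL model: it has no blocks, core haplotypes, mutations, recombinations, or error terms, and it enumerates all compatible haplotype pairs, which only scales to a few heterozygous sites. The genotype coding (0 and 1 for the two homozygous states, 2 for heterozygous) and the names `compatible_pairs` and `em_phase` are assumptions made for this illustration.

```python
from collections import defaultdict
from itertools import product


def compatible_pairs(genotype):
    """List the unordered haplotype pairs consistent with one genotype.

    Assumed coding: 0 = homozygous allele 0, 1 = homozygous allele 1,
    2 = heterozygous.
    """
    het = [i for i, g in enumerate(genotype) if g == 2]
    base = [0 if g == 2 else g for g in genotype]
    if not het:
        h = tuple(base)
        return [(h, h)]
    pairs = []
    # Pin the first heterozygous site to allele 0 on the first haplotype
    # so that each unordered pair is generated exactly once.
    for bits in product((0, 1), repeat=len(het) - 1):
        h1, h2 = list(base), list(base)
        for site, b in zip(het, (0,) + bits):
            h1[site], h2[site] = b, 1 - b
        pairs.append((tuple(h1), tuple(h2)))
    return pairs


def em_phase(genotypes, n_iter=50):
    """Estimate haplotype frequencies by EM and pick the likeliest phasing."""
    pair_lists = [compatible_pairs(g) for g in genotypes]
    haps = {h for pairs in pair_lists for pair in pairs for h in pair}
    freq = {h: 1.0 / len(haps) for h in haps}  # uniform starting point
    for _ in range(n_iter):
        counts = defaultdict(float)
        for pairs in pair_lists:
            # E-step: posterior weight of each compatible pair. The factor 2
            # for heterozygous pairs is constant within a genotype and cancels.
            weights = [freq[a] * freq[b] for a, b in pairs]
            total = sum(weights)
            for (a, b), w in zip(pairs, weights):
                counts[a] += w / total
                counts[b] += w / total
        # M-step: expected haplotype counts divided by number of chromosomes.
        n_chrom = 2 * len(genotypes)
        freq = {h: counts[h] / n_chrom for h in haps}
    phased = [max(pairs, key=lambda p: freq[p[0]] * freq[p[1]])
              for pairs in pair_lists]
    return freq, phased


# Toy data: four unambiguous homozygotes and one double heterozygote.
genotypes = [(0, 0), (0, 0), (1, 1), (1, 1), (2, 2)]
freq, phased = em_phase(genotypes)
print(phased[-1])
```

In this toy data set the homozygous individuals supply evidence for the haplotypes 00 and 11, so the ambiguous double heterozygote is resolved into that pair rather than 01/10. A block-based model such as the one described in the abstract replaces the free haplotype frequencies with a small set of core haplotypes per block, which is what keeps the computation tractable for long regions.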
Journal title:
- Proceedings of the National Academy of Sciences of the United States of America
Volume 102, Issue 1
Pages -
Publication date 2005